SMIE comes with a predefined lexical analyzer which uses
syntax tables in the following way: any sequence of characters
that have word or symbol syntax is considered a token, and so is
any sequence of characters that have punctuation syntax. This
default lexer is often a good starting point but is rarely
actually correct for any given language. For example, when both
"," and "+" have punctuation syntax, it will consider "2,+3" to be
composed of three tokens: "2", ",+", and "3".
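The behavior described above can be reproduced with a minimal sketch. It assumes that the predefined lexer is reachable as `smie-default-forward-token` (as it is in smie.el) and that the language gives both "," and "+" punctuation syntax. The names `my-collect-forward-tokens` and `my-punct-syntax-table` are hypothetical helpers, not part of SMIE.

```elisp
(require 'smie)

(defun my-collect-forward-tokens (text syntax-table)
  "Return the tokens `smie-default-forward-token' finds in TEXT.
SYNTAX-TABLE is installed in a temporary buffer first.
Hypothetical helper for illustration; not part of SMIE."
  (with-temp-buffer
    (set-syntax-table syntax-table)
    (insert text)
    (goto-char (point-min))
    (let ((tokens '())
          (token (smie-default-forward-token)))
      ;; The default lexer returns "" when it cannot advance, for
      ;; example at end of buffer or in front of a parenthesis.
      (while (not (equal token ""))
        (push token tokens)
        (setq token (smie-default-forward-token)))
      (nreverse tokens))))

(defvar my-punct-syntax-table
  (let ((st (make-syntax-table)))
    ;; Assumption: the language treats "," and "+" as punctuation.
    (modify-syntax-entry ?, "." st)
    (modify-syntax-entry ?+ "." st)
    st)
  "Hypothetical syntax table in which \",\" and \"+\" are punctuation.")

(my-collect-forward-tokens "2,+3" my-punct-syntax-table)
;; => ("2" ",+" "3")
```

Because "," and "+" are adjacent and share punctuation syntax, they are merged into the single token ",+", which is rarely what a language actually wants.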
To describe the lexing rules of your language to SMIE, you need two functions: one to fetch the next token, and another to fetch the previous token. Each function will usually first skip whitespace and comments, and then look at the adjacent chunk of text (in its direction of travel) to see if it is a special token. If so, it should skip the token and return a description of it. Usually this is simply the string extracted from the buffer, but it can be anything you want. For example:
(defvar sample-keywords-regexp
  (regexp-opt '("+" "*" "," ";" ">" ">=" "<" "<=" ":=" "=")))

(defun sample-smie-forward-token ()
  ;; Skip whitespace and comments after point.
  (forward-comment (point-max))
  (cond
   ;; A special token: move past it and return its text.
   ((looking-at sample-keywords-regexp)
    (goto-char (match-end 0))
    (match-string-no-properties 0))
   ;; Otherwise return the run of word/symbol characters, if any.
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-forward "w_")
              (point))))))

(defun sample-smie-backward-token ()
  ;; Skip whitespace and comments before point.
  (forward-comment (- (point)))
  (cond
   ;; No keyword is longer than 2 characters, hence the search limit.
   ((looking-back sample-keywords-regexp (- (point) 2) t)
    (goto-char (match-beginning 0))
    (match-string-no-properties 0))
   (t (buffer-substring-no-properties
       (point)
       (progn (skip-syntax-backward "w_")
              (point))))))
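Since the two lexers are written separately, it can help to check that they agree on the same text. The following is a rough sketch of such a check. The `my-` functions are hypothetical helpers, and the demonstration call uses SMIE's default lexers (`smie-default-forward-token` and `smie-default-backward-token`) only so that the block is self-contained.

```elisp
(require 'smie)

(defun my-tokens-forward (lexer)
  "Collect tokens from `point-min' by calling LEXER repeatedly.
Stop as soon as LEXER returns nil or the empty string."
  (goto-char (point-min))
  (let (tokens token)
    (while (and (setq token (funcall lexer))
                (not (equal token "")))
      (push token tokens))
    (nreverse tokens)))

(defun my-tokens-backward (lexer)
  "Collect tokens from `point-max' by calling LEXER repeatedly.
The result is returned in buffer order."
  (goto-char (point-max))
  (let (tokens token)
    (while (and (setq token (funcall lexer))
                (not (equal token "")))
      ;; Tokens arrive last-to-first, so pushing restores buffer order.
      (push token tokens))
    tokens))

(defun my-lexers-agree-p (text forward backward)
  "Return non-nil if FORWARD and BACKWARD tokenize TEXT identically."
  (with-temp-buffer
    (insert text)
    (equal (my-tokens-forward forward)
           (my-tokens-backward backward))))

(my-lexers-agree-p "x := y + 1"
                   #'smie-default-forward-token
                   #'smie-default-backward-token)
;; => t
```

Passing `#'sample-smie-forward-token` and `#'sample-smie-backward-token` instead should exercise the sample lexers in the same way. Note that the walk stops at the first parenthesis, because the lexers return an empty string there. Once both functions behave as intended, they are typically handed to SMIE through the `:forward-token` and `:backward-token` keyword arguments of `smie-setup`.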
Notice how those lexers return the empty string when in front
of parentheses. This is because SMIE automatically takes care of
the parentheses defined in the syntax table. More specifically, if
the lexer returns nil or an empty string, SMIE tries
to handle the corresponding text as a sexp according to the syntax
tables.
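The empty-string convention can be observed directly. The sketch below uses SMIE's default lexer, `smie-default-forward-token`, which behaves the same way in front of a parenthesis. The plain `forward-sexp` call merely stands in for the syntax-table-based skipping and is not SMIE's internal code path. The function name `my-paren-token-demo` is hypothetical.

```elisp
(require 'smie)

(defun my-paren-token-demo ()
  "Show what the default lexer returns in front of a parenthesis.
Return a list (TOKEN POINT-UNCHANGED-P POINT-AFTER-SEXP)."
  (with-temp-buffer
    (insert "(a b) c")
    (goto-char (point-min))
    (let* ((start (point))
           (token (smie-default-forward-token)))
      (list token
            ;; The lexer did not move: "(" is not a token for it.
            (= (point) start)
            ;; The syntax table knows how to skip the whole group.
            (progn (forward-sexp 1) (point))))))

(my-paren-token-demo)
;; => ("" t 6)
```

The lexer returns "" and leaves point in place, while the syntax table is enough to move over the balanced "(a b)" in one step.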